Evolutionary approach for finding the atomic structure of 
steps on stable crystal surfaces 
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The problem addressed here can be concisely formulated as follows: given a stable surface orien- 
tation with a known reconstruction and given a direction in the plane of this surface, find the atomic 
structure of the steps oriented along that direction. We report a robust and generally applicable 
variable-number genetic algorithm for step structure determination and exemplify it by determining 
the structure of monatomic steps on Si(114)-2 x 1. We show how the location of the step edge with 
respect to the terrace reconstructions, the step width (number of atoms), and the positions of the 
atoms in the step region can all be simultaneously determined. 



One-dimensional (1-D) nanostructures presently show 
tremendous technological promise due to their novel and 
potentially useful properties. For example, gold chains on 
stepped silicon surfaces^ 2 - can have tunable conduction 
properties, rare earth nanowires and bismuth nanolines 
have unusual straightness and length^ and can thus be 
useful as nanoscale contacts on chips or as templates for 
the design of other novel structures. The structure of 
step edges on silicon surfaces is of key interest, for it can 
help trigger a step-flow growth mode^ useful for prepar- 
ing high-quality wafers. Understanding the formation, 
properties, and potential applications of these intriguing 
1-D nanostructures requires knowledge of the atomic po- 
sitions of various possible adsorbate species, as well as of 
the location of the silicon atoms at the step edges. 

The determination of the atomic configuration at crys- 
tal surfaces is a long-standing problem in surface sci- 
ence. With the invention and widespread use of the scan- 
ning tunelling microscope (STM) our understanding of 
crystal surfaces (in particular of surface reconstructions) 
has progressed immensely. Nevertheless, since the STM 
acquires information about the local density of states 
and not about the atomic positions, one still needs to 
find structural models that correspond to the images ac- 
quired. The problem becomes more challenging in the 
case of one-dimensional nanostructures on semiconductor 
surfaces (steps, surface-supported nanowires, atomic or 
molecular chains), because the step and nanowire struc- 
tures may not be readily inferred from STM images given 
the possibility of having either flat supporting surfaces or 
vicinal ones with single- or multiple-layer steps. Even for 
a given direction and step height, the number of possible 
structures is daunting and their identification is tedious 
because it currently relies on relaxing ad-hoc structures 
that may or (more often) may not end up corresponding 
to the experiments. As seen in the case of Au^ 2 - or Ga£ 
on Si surfaces, one needs to propose a large number of 
atomic models and then check if they have sufficiently 
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low formation energies in order to ultimately identify or 
predict them as the actual physical nanostructures. 

Motivated by the need to find good candidates for one- 
dimensional structures on surfaces, we have set out to 
develop a global search procedure that creates and se- 
lects atomic models based on their formation energy. To 
fix ideas, we address here the problem of predicting the 
atomic structure of steps along a given direction on a 
stable surface with a known reconstruction. While we 
focus on the case of straight steps on high-index semi- 
conductor surfaces, we point out that the evolutionary 
procedure with variable atom numbers described below 
is generally applicable for finding the structure of any 
surface-supported 1-D nanostructure, provided that suit- 
able (i.e. fast and reasonably transferable) interatomic 
potentials are available. 

The particular 1-D systems chosen for this study are 
steps on stable high-index semiconductor surfaces [e.g., 
Si(114), Si(5512), Si(113)]^2^ which have a fundamen- 
tal importance in addition to the practical one alluded 
above. Due to the lack of a robust approach for propos- 
ing and sorting step models, the pioneering study^ of 
steps on Si(OOl) was followed by only a few reports of 
step structures on other semiconductor surfaces ! 10 ' 11 ! 12 
Focusing on the case of steps on the Si(114)-2 x 1 sur- 
face, experiments show that straight steps form along 
the [TlO] and the [221] directions-^ which are precisely 
the directions of spatial periodicity of the reconstructed 
Si(114) unit celli^ For each of the two directions we can 
define two types of steps (up and down), and for each 
step type there are two relative positions of the recon- 
structed unit cells on the upper and lower terraces: one 
in which the unit cells on terraces are in registry (normal) 
and another in which they are offset (shifted). A crys- 
tallographic analysis of the Si(114) surface shows that 
out of the four possible terrace configurations for each 
step direction there can be only two that are topologi- 
cal^ distinct. The configurations that we have to address 
are therefore four, denoted here by [TlO]-down, [110]-up, 
[22T]-normal, and [22T]-shifted. Fig. QJi illustrates the 
[TlO]-down configuration, while the remaining ones are 
described in note uA. 

The methodology that we choose for finding the step 
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FIG. 1: (Color online)(a) Step region (shaded) for which the 
number of atoms, their positions, and the location of the step 
edge are to be determined. The step region, which corre- 
sponds to the [110]-down configuration, is surrounded by re- 
constructed terraces with the surface unit cell marked by the 
dashed rectangles, (b) Crossover operation used in the genetic 
algorithm to search the configuration space. The step energy 
calculations involve relaxing a padding zone (in addition to 
the step region) with the supporting zone fixed. 



structures is based on a genetic algorithm, which has 
been shown to achieve fast convergence using aggressive 
multi-particle moves for systems of any dimensionality 
from clusters to bulk crystalline materials (see, e.g., Refs. 

mmH3IHIH0l2l|). In a ratner sim P le but efficient 
way, the algorithm simulates a biological process in which 
a set of individuals evolves with the goal of producing fit 
children, i.e. new step structures with low formation en- 
ergies. For all runs we have systematically kept a pool 
of p — 30 atomic structures and subsequently tested that 
a range of 30 < p < 100 is appropriate for this prob- 
lem, i.e. the pool is large enough that the evolution is 
not likely to get stuck in a local minimum and is suffi- 
ciently small so that the cross-over moves are not wasted 
optimizing mostly high-energy local minima. In general, 
the size of the pool should be determined by numerical 
experimentation for the particular system under study. 
The starting p structures ( "Generation Zero" ) are sim- 
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FIG. 2: (Color online) Finding the structure of [221]-normal 
steps 14 on Si(114). (a) Step energy A of the lowest-energy 
structure (solid line) and averaged across the pool (dashed 
line) during the genetic evolution. The lowest energy struc- 
ture is shown after 0, 200, 500, and 800 crossover operations; 
the atoms subjected to optimization are shown as darker 
spheres in the insets, (b) Evolution of the average number of 
atoms across the pool (dashed line) and of the atom number 
corresponding to lowest-energy member (solid line), (c) Step 
energies before and after the final relaxation of all members 
of the genetic pool. The formation energies decrease upon full 
relaxation unless bonds are broken in the process (as found 
in the case of structure no. 28). 



ply collections of atoms that are randomly positioned in 
the step region (colored gray in Fig. QJi) then relaxed to 
nearest local minima of the potential energy of the sys- 
tem. The evolution from the current generation to the 
next one occurs via crossover processes in which the step 
structures corresponding to two randomly chosen parents 
A and B from the pool are combined to create a new 
(child) structure, C. Referring to Fig. QJ>, the crossover 
of parents A and B is achieved by sectioning them with 
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FIG. 3: (Color online) Low-energy step structures of [TlO]-oriented steps on the Si(114) surface. The atoms subjected to 
optimization are represented by dark spheres, while the atoms making up the terrace reconstructions are the lighter ones. The 
remaining atoms are shown as smaller gray spheres. The structural motifs on the terraces are rows of dimers (d), rebonded 
atoms (r), and tetramers (t). Some or all of these motifs also make up the shown step structures with the exception of the 
most favorable down-step models [panels (a) and (b)] which include hexagon rows denoted by "h" . A schematic contour of the 
step topology was included in each side view to aid the eye. 



the same random plane, then retaining atoms from each 
parent located on different sides of this plane to create 
the child C. The plane is chosen here to be parallel to 
[114] (any polar angle about this direction is allowed) 
and located so as to pass through a randomly chosen 
point in the rectangle projected by the step zone onto 
the (114) plane. The operation so defined (Fig. [TJa) has 
built-in potential to generate child structures with differ- 
ent numbers of atoms than their parents. Any child that 
is structurally distinct from all pool members is considered 
for inclusion in the genetic pool based on its formation 
energy per unit length, which should be lower than that 
of the highest ranked (i.e. least favorable) member of 
the pool. To preserve the total population, the structure 
with the highest formation energy is discarded upon in- 
clusion of a new child. In a genetic algorithm run, the 
crossover operation is repeated to ensure that the lowest 
energy structure of the pool has stabilized; as such, the 
present systems require 2000 operations as illustrated by 
the curves in Fig. [5^ which become flat as the algorithm 
progresses. 

The formation energy of a step structure is defined as 
a per-length quantity that is in excess to the bulk and 
surface energies,-- and therefore can be written as 

A= — (E m -N m e b -jA), (1) 

Ly 

where E m is the total energy of the N m atoms that are 
allowed to move within a projected area A = L x L y with 



the dimension L x (L y ) perpendicular (parallel) to the 
step, et, is the bulk cohesion energy of Si, and 7 is the 
surface energy of the flat Si(114) surface. The poten- 
tial we have chosen to model the atomic interactions is 
the one developed by Lenosky et al.j2£ because it has 
shown reasonable transferability for diverse atomic envi- 
ronments present on high- index Si surfaces i 18 i 20 If all the 
atoms of the supercell are allowed to move when calcu- 
lating the formation energy [Eq. fT}], then each update of 
the genetic pool will be too slow for the algorithm to be 
practical. On the other hand, if we only relax the atoms 
in the step zone, then Eq. |T]) will include not only the 
formation energy but also the elastic interactions of the 
step with the nearby rigid boundaries of the step region. 
To reach a good compromise between the full accuracy of 
Eq. ([!} (which would be achieved when all atoms in the 
supercell are relaxed) and the speed required to sort out 
many structures per unit time, we introduce a padding 
zone that is relaxed along with the step region while the 
keeping the reconstructed support zone fixed (refer to 
Fig. [T|d). The use of the padding zone will diminish the 
elastics interactions in Eq. ([T]) while keeping the number 
of atoms that are relaxed after each crossover small. At 
the end of any genetic algorithm run, a full relaxation (all 
atoms allowed to move) is performed for all pool mem- 
bers in order to refine the step energies. 

Typical results of the genetic algorithm for steps are 
shown in Fig. which displays the evolution of the 
lowest and of the average formation energy of the genetic 
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FIG. 4: (Color online) Results of the genetic algorithm (a,b) 
and experimental observations of steps on Si(114) surfaces 
(c,d). (a) Lowest step energies attained at various atom num- 
bers in the step zone, (b) Final step formation energies of 
the top-ranking structures for each of the four types of steps 
described in text. The [TlO] steps have consistently lower 
energies than similarly ranked [221] structures, (c) STM im- 
age of a vicinal Si(114) surface obtained after cleaning and 
brief annealing (reproduced from Refill with permission of 
the American Institute of Physics), (d) STM image taken 
after flashing at 1225°C followed by 30 minute annealing at 
450°C (courtesy of D. E. Barlow, A. Laracuente, and L.J. 
Whitman). These experimental observations show that the 
[110] steps are preferentially longer than the [221] steps. 



pool as a function of the number of crossover operations. 
Both the lowest and the average formation energies show 
a rapid decrease at the beginning of the evolution, fol- 
lowed by slower decay in the later stages. The lowest- 
energy [22T]-normal configuration was found in less than 
1000 operations, and has been retrieved in four runs 
started from different Generation Zero scenarios with 
no significant change in the total number of crossover 
moves. Since the crossover operation described above 
creates new structures with variable numbers of atoms, 
the number of atoms N in the step region is optimized 
at the same time as the atomic positions (N is always 
smaller than the total number of atoms allowed to relax, 
N m ). Illustrative of the search for the optimal particle 
number N is Fig. O), which displays the evolution of the 
average number of atoms in the genetic pool and the par- 
ticle number corresponding to the lowest-energy member. 
Fig. shows that upon final full relaxation a certain 
amount of energetic reordering does occur, but this re- 
ordering is merely a refinement and does not eliminate 
from consideration any of the best structures deemed fa- 
vorable prior to the complete relaxation of all step models 
in the pool. When using the algorithm for an arbitrary 
line defect, the formation energy comparison before and 
after final relaxation offers the most useful criterion for 
adjusting the size of the padding zone so as to provide 
sufficient relaxation without rendering the calculations 
intractable. 

The best three structures for the up- and down-steps 
oriented along [TlO] are shown in Fig. [3j along with their 
formation energies after the final relaxation and their op- 
timal atom numbers N. The most favorable up-step and 
down-step both have negative formation energies, which 
is a known artifact of the empirical potentials ! 22 ' 24 With- 
out placing undue significance on the negative sign, we 
focus on the ranking of the formation energies and the 
corresponding structures. The reconstruction of the flat 
Si(114) surface consists of rows of dimers (d), rebonded 
atoms (r) and tetramers (t) in this specific periodic se- 
quence (...-d-r-t-d-r-t-d-....) along the [221] direction.— 
Since we allowed for a large width of the step region^ 
the steps can negotiate their width and location during 
the genetic evolution. This is apparent in Fig. [3l which 
shows that the sequence of motifs (d, r, t) is continued 
seamlessly from each terrace into the step zone until the 
atomic structure and the location of the step edge are de- 
termined. The best [TlO] down-step structures (Fig. [3k,, b) 
include a row of hexagons (labelled by "h" in Fig. [3]) in 
addition to motifs already encountered on the terraces. 
Other low-energy steps are observed to simply consist of 
a gap in the -d-r-t- sequence of the structural features 
on the terraces. For example, Fig. [3fc shows a down step 
that contains dimers, rebonded atoms and tetramers in 
the correct order, but which are bonded to the upper 
(lower) terrace via elimination of one tetramer (dimer) 
row from the -d-r-t- sequence on the terraces. The most 
favorable up-step structures contain only rows of dimers 
and rebonded atoms (Fig.[3Ji), all motifs in a different or- 
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der (d-t-r, Fig. or only rows of dimers and tetramers 
(Figs. [3f). The algorithm is thus able to find narrow 
steps (Fig. [3ji) , wide ones (Fig. [3k), as well as steps with 
intermediate widths (Fig. [3£i,b,e,f). This morphological 
and structural diversity is a tell-tale sign of the supe- 
rior configuration sampling achieved here with just one 
simple genetic operation (the crossover, Fig. QJ>). 26 

To provide a closer look at the way the algorithm sorts 
through different numbers of atoms, we have plotted the 
lowest formation energy found for every number of atoms 
N attained during the evolution (Fig. Sk)- Such a plot 
shows that the algorithm can visit, in the same evolution, 
several structures of particularly low formation energies 
(magic-number atomic configurations) and select them as 
part of the genetic pool. Magic structures are found for 
even values of N both for the up and down [TlO] steps, as 
seen in Fig. Sk- The same figure shows that the forma- 
tion energy of [22T]-normal steps also has a few distinct 
local minima, which are located at odd values of N. On 
the other hand, the [22T]-shifted configuration does not 
have magic number behavior. In this case, there are two 
deep minima for the range of atom numbers spanned, 
but after the final relaxation they are found to have the 
same structure only translated by one full terrace unit 
cell£ along the [TlO] axis. 

Finally, in Fig. Sfc> we report the formation energies of 
the lowest eight structures in the pool for each of the 
four configurations studied. The figure shows that the 
steps oriented in the [TlO] direction have smaller forma- 
tions energies than those along [221] for all top ranking 
structures found. For the [TlO] direction the down-steps 
are easier to form than the up-steps, while for the [22T] 
direction the up and down steps have identical struc- 
tures and energies. The conclusion that [TlO] steps are 
more favorable than [221] steps is consistent with the gen- 
eral expectation that a direction of higher symmetry (i.e. 
[TlO]) yields lower step energies than a low symmetry 
one. Ideally, these results could be tested by recalculat- 
ing the formation energy of steps at the level of density 
functional theory (DFT) calculations. While in principle 
DFT level calculations of formation energies of steps on 
Si(114) are tractable, the number of atoms involved is at 
least several hundred atoms which is beyond our present 
computational means. 

However, we have found that existing experimental ob- 
servations do support our genetic algorithm results, al- 
beit qualitatively. Laracuente et al3 reported STM im- 
ages (reproduced here in Fig. St) in which the [TlO] steps 
are clearly preferred over the [22T] ones even when, due 
to the preparation conditions^ steps may not assume 
the very lowest-energy structure for given direction and 
terrace configuration: this is consistent with the simula- 
tion results in Fig. St" which show that [TlO] steps are 
lower in energy for metastable structures ranked within 
the first eight at the end of the genetic evolution. More 
recently, Whitman and coworkers have also imaged step 
configurations after long anneals at 450°C. These recent 
measurements (shown in Fig.Sli) are more likely to corre- 



spond to lowest-energy step structures, and again show 
that the [TlO] are longer (more favorable) thus lending 
support to our simulation results. In terms of atomic 
positions, so far there has been no proposal for the struc- 
ture of steps on Si(114). We have proposed here several 
low-energy structures (Fig. [3]) found via the genetic pro- 
cedure, structures which are amenable to experimental 
testing via high-resolution STM measurements combined 
with ab initio density functional calculations. 

The versatile variable-number algorithm described in 
previous reports on two-dimensional surface system a 18 i 20 
has recently been extended to the case of 3-D crystal 
structure prediction. 21 While in the 2-D and 3-D cases 
structure prediction methods were already availabl e) 27 ! 28 
prior to the introduction of the genetic approaches , 18 ! 21 
to our knowledge the present variable-number genetic 
algorithm is the first robust approach to atomic struc- 
ture determination for surface-supported 1-D systems. 
A reader may rightfully argue that the use of empiri- 
cal potentials could cast doubts on the results obtained 
by using this algorithm. Although empirical (or even 
tight-binding) potentials do have artifacts which lead to 
spurious minima, these minima may be accommodated 
to some extent by increasing the size of the pool. What 
makes the algorithm robust is not the specific potential 
model, but rather the concept of an evolved database 
whose optimized members (with different atom numbers 
and different structures) can be studied subsequently at 
any level of theory, including ab initio calculations. It is 
worth noting that, in fact, the reliance on empirical po- 
tential is much smaller in the present genetic algorithm 
than in the case of molecular dynamics and continuous- 
space Monte Carlo) 27 ' 28 because in the current implemen- 
tation the genetic algorithm only relies on an acceptable 
accuracy of the local minima energies without the addi- 
tional requirement of a good description of the height of 
the barriers between these minima. 

In conclusion, we have presented a general way to 
determine the structure of steps on reconstructed crys- 
tal surfaces, and applied it to find the structure of 
monatomic steps on Si(114)-2 x 1. A key finding of this 
paper is that the step structure problem can be solved 
by what is probably the simplest genetic algorithm, i.e. 
an algorithm which is based on greedy selection of new 
members structurally distinct from the old ones and on a 
single real-space operation (cross-over between two par- 
ents using planar cuts). We speculate that the reason 
for which this simple approach works is that the under- 
lying bulk crystal provides a strong template onto which 
the structure can relax while also obeying the periodic 
boundary conditions imposed. The bulk template low- 
ers the number of both the distinct and the symmetry- 
equivalent configurations that the pool members can 
visit, and thus decreases the complexity of the problem 
as compared, e.g., with the optimization of an atomic 
cluster with the same range for the number of atoms. 
The variable-number approach is particularly well-suited 
for other problems beyond that of single-height steps on 
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clean high-index Si surfaces. Structure of the edge be- 
tween two stable facets that bound a quantum dot, gold 
atom decorations of stepped surfaces,— adsorbate- 
induced surface reconstructions^ steps on compound 
semiconductor surfaces^ and structure of step bunches 
during growth (e.g. Refs. |lfjj3lj) - to give a few signifi- 
cant examples - can be studied systematically using the 
procedure presented here, with the only necessary mod- 
ifications concerning the geometry of the supercell and 



the expression of the formation energy to account for a 
second atomic species. 
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